Auditory spectrum based features (ASBF) for robust speech recognition

Authors

  • Chi H. Yim
  • Oscar C. Au
  • Wanggen Wan
  • Cyan L. Keung
  • Carrson C. Fung
Abstract

Mel-frequency cepstral coefficients (MFCC) are features commonly used in speech recognition systems today. The recognition accuracy of systems using MFCC is known to be high in clean speech environments, but it drops greatly in noisy environments. In this paper, we propose new features, called auditory spectrum based features (ASBF), that are based on the cochlear model of the human auditory system. These new features can track the formants, and the scheme for selecting them is based on the second-order difference cochlear model and the primary auditory nerve processing model. In our experiment, the performance of MFCC and ASBF is compared in clean and noisy environments. The results suggest that ASBF are much more robust to noise than MFCC.


Similar resources

Auditory model based speech recognition in noisy environment

The main purpose of this paper is to present how to raise speech recognition performance in noisy environments. So far, the most widely used speech feature in speech recognition is probably the so-called MFCC. The recognition rate of speech recognition algorithms using MFCC and CDHMM is known to be very high in clean speech environments, but it deteriorates greatly in noisy environments, espe...


An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...


Regularized minimum variance distortionless response-based cepstral features for robust continuous speech recognition

In this paper, we present robust feature extractors that incorporate a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator, used in many front-ends including the conventional MFCC, to estimate the speech power spectrum. Direct spectrum estimators, e.g., single tapered periodogram, have high var...


Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step that is applied to t...


Auditory Contrast Spectrum for Robust Speech Recognition

Traditional speech representations are based on the power spectrum, which is obtained by energy integration over many frequency bands. Such representations are sensitive to noise, since noise energy distributed across a wide frequency band may degrade them. Inspired by the contrast-sensitive mechanism in auditory neural processing, in this paper, we propose an auditory contrast spec...



Publication date: 2000